WT-LDA: User Tagging Augmented LDA for Web Service Clustering
Authors
Abstract
Clustering Web services, which groups together services with similar functionality, helps improve both the accuracy and the efficiency of Web service search engines. An important limitation of existing Web service clustering approaches is that they rely solely on WSDL (Web Service Description Language) documents. There has been a recent trend of using user-contributed tagging data to improve the performance of service clustering. Nonetheless, these approaches fail to fully leverage the information carried by the tagging data and hence improve clustering performance only marginally. In this paper, we propose a novel approach that seamlessly integrates tagging data and WSDL documents through augmented Latent Dirichlet Allocation (LDA). We also develop three strategies to pre-process tagging data before it is integrated into the LDA framework for clustering. Comprehensive experiments based on real data and the implementation of a Web service search engine demonstrate the effectiveness of the proposed LDA-based service clustering approach.
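To make the general idea concrete, the following is a minimal sketch of topic-based service clustering with plain LDA (collapsed Gibbs sampling), where each service's tags are simply appended to its WSDL terms. It is not the WT-LDA model itself: the abstract does not specify the augmented model or the three tag pre-processing strategies, so the `tag_weight` parameter, the `wsdl_terms` and `tags` field names, the function names, and the toy data are all assumptions made for illustration.

```python
import random


def gibbs_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Plain LDA via collapsed Gibbs sampling; returns per-document topic counts."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                              # n(k)
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return doc_topic


def cluster_services(services, num_topics, tag_weight=2, **lda_kwargs):
    """Assign each service to its dominant topic.

    Assumption: tags are merged by repeating them `tag_weight` times in the
    token list. This is an illustrative baseline, not the paper's method.
    """
    docs = [s["wsdl_terms"] + s["tags"] * tag_weight for s in services]
    doc_topic = gibbs_lda(docs, num_topics, **lda_kwargs)
    return [max(range(num_topics), key=lambda k: counts[k]) for counts in doc_topic]


# Hypothetical toy data: terms extracted from WSDL documents plus user tags.
services = [
    {"wsdl_terms": ["get", "weather", "city", "forecast"], "tags": ["weather"]},
    {"wsdl_terms": ["forecast", "temperature", "zip"], "tags": ["weather", "climate"]},
    {"wsdl_terms": ["convert", "currency", "rate"], "tags": ["finance"]},
    {"wsdl_terms": ["stock", "quote", "rate"], "tags": ["finance", "stock"]},
]
print(cluster_services(services, num_topics=2, iters=100))
```

The output is one topic label per service. Label values depend on the random seed, and on a corpus this small the grouping is not guaranteed to be meaningful; the sketch only shows where tagging data could enter an LDA-based clustering pipeline.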
Similar resources
LDA based User-Tag model for Automatic Image Geo-Tagging
Determining the precise location of the immense amounts of visual data on the internet would be beneficial to many applications, such as construction of detailed 3D location models and providing location-based services. In this work, we propose two latent dirichlet allocation (LDA)-based approaches to model user-tags for automatic geo-tagging of images. Specifically, as a first step in the task...
Web Event Topic Analysis by Topic Feature Clustering and Extended LDA Model
To analyze topics of a large number of web events, we proposed an event topic analysis approach by topic feature clustering and extended LDA (latent dirichlet allocation) model. The extended LDA model is dimension LDA (DLDA) which integrates topic probability of LDA model. We represent an event as a multi-dimensions vector and use DLDA model to select topic feature words in events. We aggregate...
Modeling and Leveraging Social Collective Intelligence
The rise of social interactions on the Web requires developing new methods of information organization and discovery. To that end, we propose a generative community-based probabilistic tagging model that can automatically uncover communities of users and their associated tags. We experimentally validate the quality of the discovered communities over the social bookmarking system Delicious. In c...
WSSE: A Web Service Search Engine for Large Scale Web Service Discovery based on the Probabilistic Topic Modeling and Clustering
With the ever increasing number of web services, discovering the appropriate web service requested by users has become a vital yet challenging task. The aim of this project is to provide an efficient search engine that can retrieve the most relevant web services in a short time. The proposed search engine WSSE is based on the probabilistic topic modeling and clustering techniques that are integ...
Web-Scale Image Annotation
In this paper, we describe our experiments using Latent Dirichlet Allocation (LDA) to model images containing both perceptual features and words. To build a large-scale image tagging system, we distribute the computation of LDA parameters using MapReduce. Empirical study shows that our scalable LDA supports image annotation both effectively and efficiently.